Add retrieval support in langchain #124
…python-genai into langchain-retrieval
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds LangChain retriever (vector-store retrieval) instrumentation coverage, including callback handler support and end-to-end validation via integration + conformance tests.
Changes:
- Implemented `on_retriever_start`/`on_retriever_end`/`on_retriever_error` in the LangChain callback handler.
- Added integration tests for retriever spans/attributes/metrics and callback-handler unit tests for retrieval lifecycle.
- Added a conformance retrieval scenario and registered it in the conformance test suite; updated dependency + changelog.
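
For orientation, the three retriever callbacks listed above could be wired roughly as follows. This is a minimal standalone sketch, not the PR's implementation: `RetrieverCallbackSketch` and `RetrievalRecord` are hypothetical names, the signatures are simplified from LangChain's callback hooks, and a real handler would presumably subclass LangChain's `BaseCallbackHandler` and start and end spans rather than fill a dataclass.

```python
from dataclasses import dataclass, field
from typing import Any, Optional
from uuid import UUID, uuid4


@dataclass
class RetrievalRecord:
    """Hypothetical stand-in for the per-run retrieval invocation state."""

    query: str
    documents: list[dict[str, Any]] = field(default_factory=list)
    error: Optional[str] = None
    finished: bool = False


class RetrieverCallbackSketch:
    """Tracks retriever runs by run_id (assumed shape, not the PR's handler)."""

    def __init__(self) -> None:
        self.active: dict[UUID, RetrievalRecord] = {}
        self.completed: list[RetrievalRecord] = []

    def on_retriever_start(
        self, serialized: dict[str, Any], query: str, *, run_id: UUID, **kwargs: Any
    ) -> None:
        # Open a record keyed by run_id so the end/error hooks can find it.
        self.active[run_id] = RetrievalRecord(query=query)

    def on_retriever_end(
        self, documents: list[Any], *, run_id: UUID, **kwargs: Any
    ) -> None:
        record = self.active.pop(run_id, None)
        if record is None:
            return  # Unknown run: nothing to close.
        record.documents = [
            {"content": getattr(doc, "page_content", str(doc))} for doc in documents
        ]
        record.finished = True
        self.completed.append(record)

    def on_retriever_error(
        self, error: BaseException, *, run_id: UUID, **kwargs: Any
    ) -> None:
        record = self.active.pop(run_id, None)
        if record is None:
            return
        record.error = type(error).__name__
        record.finished = True
        self.completed.append(record)
```

Popping the record in both the end and error hooks keeps the `active` map from leaking state when a retriever fails.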
Reviewed changes
Copilot reviewed 7 out of 11 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| instrumentation/opentelemetry-instrumentation-genai-langchain/src/opentelemetry/instrumentation/genai/langchain/callback_handler.py | Adds retriever lifecycle callbacks and maps retrieved documents into invocation data. |
| instrumentation/opentelemetry-instrumentation-genai-langchain/tests/test_callback_handler.py | Adds unit tests validating retriever callback behavior and state management. |
| instrumentation/opentelemetry-instrumentation-genai-langchain/tests/test_retriever.py | Adds end-to-end integration tests for retrieval spans, attributes, and duration metrics. |
| instrumentation/opentelemetry-instrumentation-genai-langchain/tests/conformance/retrieval.py | Introduces a conformance scenario for retrieval instrumentation. |
| instrumentation/opentelemetry-instrumentation-genai-langchain/tests/test_conformance.py | Registers the new RetrievalScenario in the conformance suite. |
| instrumentation/opentelemetry-instrumentation-genai-langchain/pyproject.toml | Adjusts opentelemetry-util-genai minimum dependency version. |
| instrumentation/opentelemetry-instrumentation-genai-langchain/.changelog/124.added | Changelog entry for retrieval span support. |
    invocation.documents = [
        {
            "content": doc.page_content,
            **({"id": doc.id} if doc.id is not None else {}),
            **{
                k: v
                for k, v in cast(dict[str, Any], doc.metadata).items()
                if v is not None
            },
        }
    dependencies = [
        "opentelemetry-instrumentation >= 0.62b0",
        "opentelemetry-util-genai >= 1.0b0.dev",
        "opentelemetry-util-genai >= 0.1b0",
    class _FakeRetriever(BaseRetriever):
        """In-memory retriever — no network calls, no embeddings."""

        documents: list[Document] = []
    invocation.documents = [
        {
            "content": doc.page_content,
            **({"id": doc.id} if doc.id is not None else {}),
Why does doc_id fall back to an empty object? It should be a string.
It seems it's required in semconv, but given it's optional in langchain, we should mark it optional in semconv. Could you please send an issue (or PR) to update it?
And then just don't fall back to anything; leave it None.
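
To make the two options concrete, here is a small sketch. `FakeDoc` is a hypothetical stand-in for LangChain's `Document` (only `page_content` and `id` are modeled), and both helper names are made up for illustration. Worth noting: spreading an empty dict adds no keys, so the pattern in the diff omits `id` entirely rather than setting it to an empty object, while the second helper keeps the key and leaves it as `None`.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FakeDoc:
    """Hypothetical stand-in for a LangChain Document."""

    page_content: str
    id: Optional[str] = None


def to_entry_omit_id(doc: FakeDoc) -> dict[str, Any]:
    # Pattern from the diff: spreading {} contributes no keys,
    # so "id" is simply absent when doc.id is None.
    return {
        "content": doc.page_content,
        **({"id": doc.id} if doc.id is not None else {}),
    }


def to_entry_keep_none(doc: FakeDoc) -> dict[str, Any]:
    # Alternative: always emit the key and leave the value as None.
    return {"content": doc.page_content, "id": doc.id}
```

Which of the two is preferable likely depends on whether the semantic conventions end up treating the document id as optional.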
                k: v
                for k, v in cast(dict[str, Any], doc.metadata).items()
                if v is not None
            },
This seems to add all metadata properties to the document, which is not documented in the semantic conventions.
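
One possible way to address this is an allow-list, sketched below. The field names in `DOCUMENTED_FIELDS` are an assumption for illustration only and would need to be checked against the semantic conventions; `filter_metadata` is a hypothetical helper, not part of the PR.

```python
from typing import Any, Mapping

# Assumed allow-list for illustration; verify against the semantic conventions.
DOCUMENTED_FIELDS = frozenset({"id", "score"})


def filter_metadata(
    metadata: Mapping[str, Any],
    allowed: frozenset = DOCUMENTED_FIELDS,
) -> dict[str, Any]:
    """Keep only allow-listed keys whose values are not None."""
    return {
        key: value
        for key, value in metadata.items()
        if key in allowed and value is not None
    }
```

With this shape, arbitrary metadata such as a source path would be dropped instead of being copied onto the recorded document.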
                if v is not None
            },
        }
        for doc in documents
We never populate score, but it should be available at least in some cases, e.g. https://reference.langchain.com/python/langchain-core/vectorstores/base/VectorStore/similarity_search_with_relevance_scores
Is there a way to get it somehow?
If it's not always available, we should also update semantic conventions to make it not required. Please create an issue (or PR) to make it optional.
cc @JWinermaSplunk, who added the retrieval span in semconv, in case he has thoughts
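
On the score question: `similarity_search_with_relevance_scores` returns `(Document, score)` pairs, but the retriever end callback only receives documents, so a score is visible to the handler only if the retriever copies it somewhere readable. The sketch below assumes a custom retriever stores it under a `"score"` metadata key, which is a convention rather than something LangChain guarantees. `FakeDoc`, `attach_scores`, and `extract_score` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class FakeDoc:
    """Hypothetical stand-in for a LangChain Document."""

    page_content: str
    metadata: dict[str, Any] = field(default_factory=dict)


def attach_scores(pairs: list[tuple[FakeDoc, float]]) -> list[FakeDoc]:
    """Copy each score into metadata, as a custom retriever might do."""
    docs = []
    for doc, score in pairs:
        doc.metadata["score"] = score  # assumed key name
        docs.append(doc)
    return docs


def extract_score(doc: FakeDoc) -> Optional[float]:
    """Return a numeric score if one is present, otherwise None."""
    score = doc.metadata.get("score")
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        return None
    return float(score)
```

Because the score is absent for plain retrievers, the handler would have to treat it as optional, which matches the request to relax the requirement in the semantic conventions.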
Retrieval span and metric.
